Searching and Indexing Genomic Databases via Kernelization

Authors

  • Travis Gagie
  • Simon J. Puglisi
Abstract

The rapid advance of DNA sequencing technologies has yielded databases of thousands of genomes. To search and index these databases effectively, it is important that we take advantage of the similarity between those genomes. Several authors have recently suggested searching or indexing only one reference genome and the parts of the other genomes where they differ. In this paper, we survey the 20-year history of this idea and discuss its relation to kernelization in parameterized complexity.
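
The idea of keeping one reference genome plus only the parts where the other genomes differ can be illustrated with a small sketch. This is not the authors' method. It is a toy version under strong assumptions: all genomes have the same length as the reference, they differ only by substitutions, and patterns are no longer than a fixed bound. The names `build_kernel`, `occurs`, and `max_pattern_len` are hypothetical and chosen for illustration.

```python
def build_kernel(reference, genomes, max_pattern_len):
    """Return the reference plus windows around each difference.

    Assumes (for illustration only) that every genome has the same
    length as the reference and differs from it by substitutions.
    """
    pieces = [reference]
    radius = max_pattern_len - 1
    for genome in genomes:
        if len(genome) != len(reference):
            raise ValueError("this sketch only handles equal-length genomes")
        # Positions where this genome differs from the reference.
        diffs = [i for i, (a, b) in enumerate(zip(reference, genome)) if a != b]
        # Half-open windows [start, end) around each difference, merged
        # when they overlap or touch.
        windows = []
        for d in diffs:
            start = max(0, d - radius)
            end = min(len(genome), d + radius + 1)
            if windows and start <= windows[-1][1]:
                windows[-1][1] = max(windows[-1][1], end)
            else:
                windows.append([start, end])
        pieces.extend(genome[s:e] for s, e in windows)
    return pieces


def occurs(pattern, pieces):
    """Report whether the pattern occurs in any stored piece."""
    return any(pattern in piece for piece in pieces)


reference = "ACGTACGTAC"
genomes = ["ACGTTCGTAC"]  # differs from the reference at index 4
pieces = build_kernel(reference, genomes, max_pattern_len=3)
print(pieces)
print(occurs("TTC", pieces), occurs("GGG", pieces))
```

Under these assumptions, any occurrence of a pattern of length at most `max_pattern_len` either lies in a region identical to the reference or overlaps a difference, in which case it fits inside the window kept around that difference. A real index would replace the plain substring test with a compressed text index and would also have to handle insertions and deletions.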

Similar resources

Fast Dictionary Lookup in Genomic Information Retrieval

and retrieval techniques for homology searching of genomic databases are increasingly important as the search tools are facing great challenges of rapid growth in sequence collection size. Consequently, the indexing and retrieval of possibly gigabytes sequences become expensive. In this paper, we present two new approaches for indexing genomic databases that can enhance the speed of indexing an...

The State of Information Retrieval in the Namayeh and Nama Databases and an Assessment of the Effectiveness of Controlled Vocabulary in Indexing the Two Databases

Purpose: This study was carried out to determine the level of precision, recall, and searching time for “Nama” and “Namayeh” databases, as well as to find out which of the indexing tools (thesaurus and Dewey decimal classification) helps us more in improvement of information retrieval. Methodology: This study is an analytical survey in which the necessary data was collected by direct observati...

Indexing and Retrieval for Genomic Databases

Genomic sequence databases are widely used by molecular biologists for homology searching. Amino acid and nucleotide databases are increasing in size exponentially, and mean sequence lengths are also increasing. In searching such databases it is desirable to use heuristics to perform computationally intensive local alignments on selected sequences only and to reduce the costs of the alignments tha...

The Hybrid Digital Tree and Its Applications to Genomic Sequence Databases

By Qiang Xue. This dissertation focuses on index structures, search algorithms, and applications for large string databases whose indexes cannot fit entirely in the main memory (RAM). String searching is a classic research topic that has received increasing attention in recent years, due to the rapid growth of digital tex...

Protein Sequence Similarity Search Suitable for Parallel Implementation

Having entered the post genomic era, there lies a plethora of information, both genomic and proteomic. This provides quite a lot of resources so that the computational and machine learning strategies be applied to address the problems of biological relevance. Searching in biological databases for similar or homologous sequences is a fundamental step for many bioinformatics tasks. On discovery o...

Journal title:

Volume 3  Issue 

Pages  -

Publication date 2015